Ancient reindeer mitogenomes reveal island-hopping colonisation of the Arctic archipelagos

Climate warming at the end of the last glacial period had profound effects on the distribution of cold-adapted species. As their range shifted towards northern latitudes, they were able to colonise previously glaciated areas, including remote Arctic islands. However, there is still uncertainty about the routes and timing of colonisation. At the end of the last ice age, reindeer/caribou (Rangifer tarandus) expanded to the Holarctic region and colonised the archipelagos of Svalbard and Franz Josef Land. Earlier studies have proposed two possible colonisation routes, either from the Eurasian mainland or from Canada via Greenland. Here, we used 174 ancient, historical and modern mitogenomes to reconstruct the phylogeny of reindeer across its whole range and to infer the colonisation route of the Arctic islands. Our data shows a close affinity among Svalbard, Franz Josef Land and Novaya Zemlya reindeer. We also found tentative evidence for positive selection in the mitochondrial gene ND4, which is possibly associated with increased heat production. Our results thus support a colonisation of the Eurasian Arctic archipelagos from the Eurasian mainland and provide some insights into the evolutionary history and adaptation of the species to its High Arctic habitat.


Mitogenome sequencing
To reconstruct the phylogeny of reindeer across the whole Holarctic region, investigate the origin and route of colonisation of the Arctic archipelagos by Rangifer tarandus and examine the mitogenomic basis of its adaptation to the arctic environment, we analysed 123 published and 51 newly-sequenced ancient and historical (i.e., 4752-100 years BP) as well as modern whole mitogenomes.The newly-sequenced ancient and historical samples (n = 31) had a mean sequencing depth ranging from 6× to 2131×, and an endogenous content ranging from 3 to 89% (Table S1).Some mitogenomes were excluded due to low quality (see "Methods"), thus a reduced mitochondrial alignment of 166 mitogenomes with a length of 16,349 bp was used to construct a haplotype network.Furthermore, a subset of 83 mitogenomes with a length of 16,351 bp was used to construct a tip-dated Bayesian phylogeny.The data showed that the extinct East Greenland caribou, Svalbard reindeer, and Novaya Zemlya reindeer have the lowest nucleotide diversity, at least an order of magnitude lower than in any other population previously analysed (Table 1).Franz Josef Land is labelled as 'FJL' and Novaya Zemlya as 'NVZ' .Additionally, the map includes close-ups of the three Eurasian Arctic archipelagos Novaya Zemlya, Franz Josef Land and Svalbard.Present-day glaciation is shown in the close-ups.Arrows point out the biggest islands or structures of each archipelago.Black and orange dots point out the approximate known sampling locations.
Table 1.Overview of the nucleotide diversity, haplotype diversity, number of segregating sites and total number of haplotypes in the different geographic populations.Sample sizes are reported for the different sample groups as (n = X).The number following the ± stands for standard deviation.

Haplotype network
The haplotype network analysis revealed three distinct clusters (i.e., I, II and III; Fig. 2).Among the 166 reindeer mitogenomes, a total of 73 distinct mitochondrial haplotypes were identified (Fig. 2).The highest haplotype diversity is found in Canada, which includes several lineages and ecotypes (Table S1).
Cluster I shows a star-like structure and comprises 27 haplotypes represented by 114 Svalbard individuals.Nested within this cluster are two of the three individuals from Franz Josef Land, each represented by its own unique haplotype, and the five individuals from Novaya Zemlya represented by a single haplotype.The haplotype from one ancient Svalbard sample (SB_MLM61; ca.1875 years BP) is at the centre of cluster I.All other cluster I haplotypes stem from this one in a star-like pattern, indicating that it may be ancestral.Closest to the centre of the cluster I are mostly haplotypes found in single ancient individuals.Haplotypes found in modern individuals from Svalbard are located on cluster I's outer branches.Two of the three Franz Josef Land haplotypes and the Novaya Zemlya haplotype group on an outer branch of cluster I with one of the two oldest 14 C-dated samples (SB_MLM82; 3127 years BP), slightly outside of the centre of the network.
Cluster II is the most divergent among the three clusters, showing the highest number of substitutions (i.e.64) between branches.It comprises seven individuals from Canada, each represented by its own unique haplotype, distributed in two smaller branches.Finally, cluster III comprises 32 haplotypes from most sampled populations (i.e., Canada, Finland, Russia, Sweden, Norway, Greenland and Franz Josef Land) across four smaller branches with no obvious geographical structuring.

Bayesian phylogeny and timing of divergence
The Bayesian phylogeny revealed seven clades and 10 major nodes (A-H; Fig. 3) with strong statistical support (PP > 0.92) except for nodes D and E (PP = 0.35-0.36).The earliest branching clade 1 comprises individuals from Canada that belong to the NAL.There is little geographical structure within clades 2-6, and Eurasian and Canadian samples are distributed across the branches.All these clades belong to the BEL.In clade 7, all of the modern, historical and ancient Svalbard individuals form a group that is sister to a clade containing two of the three samples from Franz Josef Land and the five samples from Novaya Zemlya.Clade 3 comprises all four modern Norwegian individuals.The estimated mean ages for the historical samples from Eastern Greenland range from 1968 to 1870 years BP (Table S2) and have 95% highest posterior density (HPD) values ranging from 0 years BP to 4701 years BP.
The earliest split in the phylogenetic tree separates the NAL and the BEL at 69,577 years BP (95% HPD: 50,871-87,202 years BP; node A, Fig. 3).The most recent split between parts of the BEL and the Svalbard, Franz Josef Land and Novaya Zemlya archipelagos is estimated at 20,065 years BP (95% HPD: 14,487-25,666 years BP, node G), and the divergence between Svalbard and Franz Josef Land/Novaya Zemlya is estimated at 8792 years BP (95% HPD: 6526-11,256 years BP, node H).The split between the two Franz Josef Land samples is estimated at 6710 years BP (95% HPD: 5235-8362 years BP, node I), and the first divergence among Svalbard individuals at 5925 years BP (95% HPD: 4415-7585 years BP, node J).The estimated posterior substitution rate for the mitogenomes based on the inferred 14 C dates of ancient samples and of historical and modern samples with a known year of death was 9.4148 × 10 -8 substitutions/site/year (95% HPD Interval: 5.84 × 10 -8 , 1.31 × 10 -7 ).

Positive selection and adaptation to cold environments
We tested for evidence of positive selection in modern cold-adapted Svalbard reindeer using the random effects and Bayesian approaches implemented in MEME and FUBAR, respectively, which estimate the ratio of synonymous and non-synonymous mutations (i.e., ω = dN/dS) at a given site.Using MEME, we found no indication for positive selection in any mitochondrial genes.In contrast, FUBAR showed evidence for positive selection in only a single codon (i.e., position 205; AA Lysine) within gene ND4 of Complex I, with a posterior probability of 0.922 and an ω value of 7.325 (synonymous α = 4.950, non-synonymous β = 36.260).

Discussion
Here, we used 174 ancient, historical and modern mitogenomes to resolve the timing, origin and route of colonisation of the Arctic islands by reindeer and to investigate the genetic basis of adaptation to the Svalbard environment.First, in agreement with previous studies, we identified two distinct mtDNA reindeer lineages: NAL and BEL [23][24][25][26][27] .Second, our dataset revealed Franz Josef Land and Novaya Zemlya as closest sister clades to Svalbard, thereby indicating a recent shared ancestry for these Arctic islands.Furthermore, we found some evidence for positive selection within the mtDNA of the cold-adapted Svalbard reindeer.
Consistent with Taylor et al. 27 our phylogenetic tree identifies the same two mtDNA lineages.Seven Canadian samples that diverged ca.70,000 years BP in our phylogeny most likely belong to the NAL, whereas all other samples belong to the BEL lineage.Furthermore, mitogenomes from different subspecies are found throughout the tree thus showing inconsistencies with the current reindeer taxonomy.Our estimate of the divergence time between the NAL and BEL lineages based on a tip-dated phylogeny of whole mitochondrial genomes (ca.50,000-87,000 years BP) overlaps with the estimate of Klütsch et al. 23 (i.e., 44,600-135,800 years BP) based on microsatellite data and mtDNA control region haplotypes, as well as that of Polfus et al. 24 (i.e., 68,600-173,600 years BP).However, our estimate is older than that of McDevitt et al. 26  (28,100-46,700 years BP) and in stark contrast with that of Yannic et al. 25 (184,000-430,000 years BP).The inconsistency between our estimate and those in these two studies 25,26 is likely due to the fact that they used control region and mtDNA cytochrome b sequences, respectively, whereas we have used whole mitogenome sequences as well as ancient radiocarbon dated samples.In comparison to fossil-based approaches 25 , which use the most recent common ancestor of the Odocoleini for calibration, tip-dating incorporates multiple radiocarbon-dated samples, thus improving accuracy of divergence times and preventing underestimation in cases where clades are older than their oldest known fossil 52,53 .Future approaches using a broader dataset of ancient genomes and dated specimens, especially North American individuals (NAL), may be able to provide more precise information on the timing of divergence of the two lineages.
Moreover, in accordance with the results of Taylor et al. 27 , two individuals from Western Greenland formed a sister group to one of the two Peary caribou individuals (C_28), thus supporting a close relationship between modern Peary caribou and Greenland caribou.In contrast, the second Peary caribou individual (C_27) forms a sister group to the three ancient and extinct Eastern Greenland caribou (R. t. eogroenlandicus) samples, possibly as a result of local adaptation as suggested by 27 .Furthermore, earlier studies found a close relationship between Peary caribou and the extinct Eastern Greenland caribou based on morphological evidence as well as shared haplotypes 37,54 .The three Eastern Greenlandic samples were not radiocarbon-dated but the mean age estimates based on the phylogeny range from 1968 to 1870 years BP, suggesting that they likely are part of the extinct subspecies.This is further supported since they do not branch with the Western Greenlandic samples, which have been identified as belonging to the extant Greenland caribou subspecies (R. t. groenlandicus).However, the molecular age ranges for the three samples from Eastern Greenland overlap with 0 years BP, suggesting a possibility of being modern and belonging to the extant Greenland caribou subspecies.To ultimately confirm the age of these samples, it would be necessary to radiocarbon date them.Nonetheless, simply relying on their phylogenetic placement, it seems likely that they are part of the extinct subspecies.The remaining unresolved clades 2-6 of the phylogeny show little geographic structure, most likely due to admixture or introgression between lineages 55 .The dated phylogeny estimated the divergence time between most of Eurasia and the Arctic islands at ca. 20,000 years BP.This estimate coincides with the end of the LGM ca.20,000-18,000 BP and the beginning of a warmer period that initiated glacial retreat 20,37 .During the Weichselian glaciation (ca.115,000-12,000 BP) reindeer established large populations covering most of northern Eurasia.These populations gradually began to retreat northward at the end of the Holocene, reaching as far as the High Arctic archipelagos 20 .However, since the deglaciation of these Arctic islands began slightly later ca.13,000-10,000 years BP, their environmental conditions at the end of the LGM were not yet suitable for reindeer due to insufficient vegetation cover 37,[56][57][58][59] .The colonisation of the archipelagos therefore must have occurred later in the early Holocene (from 10,000 years BP onwards) when the climate became milder 37,57 .Receding glaciers, warmer temperatures and reduced snow-cover during the early-to mid-Holocene (9000-6000 years BP) facilitated growth of various plants such as mosses, dwarf shrubs, lichens and grasses on the High Arctic archipelagos, thus providing favourable and extended vegetation sources for reindeer 21,[57][58][59] .Consistent with these studies on the past climate and vegetation of the archipelagos, our Bayesian approach estimates the divergence of reindeer of the Eurasian Arctic islands at ca. 8000 years BP.The earliest published evidence for reindeer on Svalbard is based on reindeer faeces that were found in peat deposits radiocarbon-dated to 3800-5000 years BP 59 whereas ancient antlers were recently dated up to 7,080 years BP (M.Le Moullec, B. B. Hansen & M. Martin, unpubl.data, 60 ).Furthermore, ancient antlers from Franz Josef Land were radiocarbon dated at ca. 6400 years BP 21 .Consistent with these dates, our phylogeny supports a divergence between the two ancient samples from Franz Josef Land at ca. 6700 years BP and a divergence between the ancient and modern Svalbard samples slightly more recently at ca. 5900 years BP.For Novaya Zemlya, there are no radiocarbon dated estimates available and our phylogeny only includes two modern individuals.However, given geomorphologic studies on the Eurasian ice-sheet and the close relationship to the other archipelagos revealed by genetic studies, a colonisation at that time would seem plausible 4,37,61 .The close relationship between the contemporary and extinct reindeer populations of Svalbard, Franz Josef Land and Novaya Zemlya are thus consistent with a scenario where Svalbard was colonised from the east.
Surprisingly, one sample from Franz Josef Land is placed in a different clade than the other ones.FJL1b is radiocarbon dated to 3896 years BP and clusters in clade 4 with two samples from Russia.However, this radiocarbon date cannot be further corroborated by damage patterns as the sample was USER-treated.A possible explanation for this clustering may be that FJL1b represents a secondary colonisation, since it is still unclear whether Franz Josef Land had a suitable habitat to sustain a continuous population over time or rather if multiple extinction-colonisation events occurred 21,61,62 .Therefore, despite clustering with Russian samples, FJL1b most likely comes from Franz Josef Land.
Overall, our results are concordant with a previous study 4 that found a shared haplotype among the three High Arctic islands and the Russian mainland, likely supporting an eastern route of colonisation.Additional contemporary evidence of such a dispersal route includes an observation of a live reindeer in Svalbard in 1912 bearing antler markers by the people of Novaya Zemlya, possibly also indicating contact to the mainland and confirming that movement across the ice was indeed possible 33,35 .
In contrast to earlier studies based on transferrin variation, we did not find a close relationship between Peary caribou, other North American caribou and Svalbard reindeer [30][31][32] .A more recent study based on sequences of the mtDNA control region 6 found a shared haplotype between Svalbard reindeer and caribou from Quebec, suggesting a North American colonisation route.However, this haplotype is the same one previously found on the Taimyr Peninsula 37 , thus equally supporting an eastern route of colonisation.
It is worth noting that the absence of ancient mitogenomes from Canada and Siberia in our dataset potentially limits our understanding of the colonisation history of the Arctic islands.For instance, our results show that the clade closest to that of the Eurasian Arctic islands reindeer comprises individuals from Canada and Fennoscandia rather than Russia, thus challenging a unidirectional eastern origin hypothesis.Furthermore, the dispersal history of caribou is complex, with strong evidence for admixture between the NAL and BEL lineages 23,24,27,28 .Nevertheless, evidence based on nuclear genomes 60,63 shows a closer affinity between Svalbard and Russian reindeer and thus supports an eastern route of colonisation, consistent with the interpretation based on our mtDNA data.Moreover, in the considered time frame (since the end of the LGM, ca.10-12 ka BP), a direct colonisation from east or west is a more likely hypothesis than circumpolar dispersal from Taimyr across Northern Siberia, Alaska, Canada and Greenland.
Colonisation of its new High Arctic habitat must have exposed reindeer to a number of selective pressures, due to different environmental and thermal conditions compared to the mainland 38 .To produce and retain heat, reindeer require a number of physiological adaptations for an enhanced thermoregulation.Mitochondria are responsible for energy and heat production and both processes rely on a proton gradient that is generated during the process of oxidative phosphorylation (OXPHOS) 41 .Because caloric intake needs to be allocated to www.nature.com/scientificreports/either energy production (i.e.coupled OXPHOS) or heat generation (i.e.uncoupled OXPHOS), there is a tradeoff between those processes 41,64 .In an arctic environment, mutations in the mtDNA selecting the generation of heat rather than energy can occur and seem likely 38,41,64,65 .The FUBAR method found evidence for positive selection in the gene ND4 of Complex I in the OXPHOS pathway.Complex I is the first and largest of the five complexes that make up the OXPHOS pathway 64 .It is responsible for the provision of electrons enabling the translocation of protons across the inner mitochondrial membrane creating the pH gradient and the subunit ND4 acts as a direct proton pump 41,64 .Hence, a change in sequence in this gene could potentially be of adaptive value, especially in regards to environmental pressures such as cold-stress, hypoxia or nutrient availability 41 .While thermoregulation is a biological function likely under selection in Svalbard reindeer, there are other environmental factors such as nutrient availability that may affect the OXPHOS trade-off 41 .In contrast to cold-adaptation, starvation or a shortage of nutrients may act as a force to generate ATP more efficiently, resulting in the exact opposite effect of enhancing the coupling of energy production in the OXPHOS pathway 41 .Importantly, signatures of positive selection in the mtDNA involving gene ND4 and support for the trade-off hypothesis have been found in several species adapted to different climates and habitats (e.g., fish 41,66 , birds 67 , insects 65 , arachnids 40 ).In light of the environmental conditions of reindeer, it seems possible that the site under positive selection induces an elevated efficiency of OXPHOS uncoupling, resulting in an enhanced heat production.In addition to uncoupling as a mechanism for coldadaptation, mitochondrial proteins may also impact the production of energy and heat through changes in mtDNA copy number, transcription and translation, messenger RNA stability as well as protein stability 38,40 .Camus et al. 65 found that a sequence variation within the mtDNA, such as the present mutation on the positively selected site, may affect the copy number of mitogenomes and gene expression, therefore possibly enabling an adaptation to a cold climate.Further research on gene expression and the determination of mtDNA copy numbers are necessary to prove or rule out a possible impact on energy production through these factors.
In contrast to FUBAR 68 , there was no evidence for positive selection using MEME even though it is considered to be more sensitive because it considers both pervasive and episodic selection 69 .While MEME 70 should show results that are consistent with methods only able to detect pervasive selection such as FUBAR, it is generally more powerful in detecting episodic selection.FUBAR is instead considered more robust than random effect models such as MEME, as it uses a more flexible and less restrictive setting and is thus less sensitive to model specifications 66 .It should be noted that it might be beneficial to study the nuclear genome in context of the uncoupled OXPHOS pathway and cold-adaptation, since Complex I also harbours subunits that are encoded by the nuclear genome and studies suggest that, despite the mtDNA evolving roughly 4-10 times faster, a coevolution of mtDNA and nuclear DNA is to be expected 71 .
Using aDNA, we show how species colonise remote islands in the Arctic and adapt to the environmental conditions of Svalbard.Furthermore, we refine previous divergence dates connected to glacial refugia and range shifts.Whole ancient genomes of reindeer as well as other terrestrial species would be essential to examine in more detail the history of colonisation of the Arctic archipelagos.Finally, additional data from Novaya Zemlya and Franz Josef Land would help to study patterns of adaptation to the conditions similar to Svalbard and to better resolve evolutionary relationships among the three islands.

Sample acquisition
The 31 historical and ancient samples (i.e., antlers, subfossil bones and dental calculus) were collected during several field trips on various sites of Svalbard (n = 17), Eastern Greenland (n = 3), Franz Josef Land (n = 3), Sweden (n = 7) and Norway (n = 1).Raw sequences for most ancient Svalbard samples (n = 16) were previously published by Kellner et al. 72 and are available under the BioProject Accession no.PRJEB60484. 14C dates were calibrated using the IntCal20 calibration curve 73 .The whole mitogenomes from modern individuals comprise previously published samples from Svalbard (n = 96; ENA BioProject PRJEB57293 and PRJEB61721) and Russia (n = 1; ENA BioProject PRJEB57293 and PRJEB61721) as well as newly sequenced genomes from Northern Finland (n = 3), Norway (n = 4) and several locations in Russia (n = 13).Raw sequences for the modern Canadian samples (n = 24) and the modern Western Greenland samples (n = 2) were downloaded from GenBank (NCBI) under the BioProject Accession No. PRJNA634908 27 .

DNA extraction and sequencing
Historical and ancient samples were processed using various protocols in dedicated, positively-pressurised ancient DNA laboratory facilities at the Centre for Palaeogenetics (Stockholm), the NTNU University Museum, and Uppsala University.All procedures were performed using sterilised equipment and working surfaces to minimise the risk of exogenous and cross-sample contamination.For the three samples from Eastern Greenland (HOC#1,2,5) and one sample from Franz Josef Land (FJL1b), DNA was extracted from ca.50 mg of bone or antler powder.To prevent exogenous contamination of the bone powder, bones were wrapped in aluminium foil except for the small drilling area and the drilling bit was changed after removing the outer part of the bone.DNA extraction was carried out following the control protocol described in Dehasque et al. 44 and included two negative controls (extraction blanks), and double-stranded DNA (dsDNA) libraries were built following the protocol of Meyer and Kircher 74 , including two additional negative controls (library blanks) per 10 samples, with an additional incubation step removing deaminated cytosine sites (considered damage patterns) through uracil-DNA-glycosylase and endonuclease VIII (USER) treatment 75 .Lastly, libraries were sequenced on either 2 × 50 bp chemistry on an Illumina SPrime or 2 × 100 bp chemistry on the Illumina NovaSeq S4 at the National Genomics Infrastructure (NGI, Stockholm, Sweden).Four extraction blanks processed alongside the first 20 reindeer samples extracted with this method were sequenced and ran through a metagenomics pipeline, which showed that they contained< 0.01% reads assigned to mammals, indicating no reindeer cross contamination into the blanks.
The seven historical samples from Sweden, one historical sample from Norway and three modern samples from Russia (Table S1) were collected as dental calculus scraped from the individuals' teeth.Surface decontamination consisted of 10 min of UV light exposure and 30 s of EDTA wash, as has been previously described in Brealey et al. 76 .Next, DNA was extracted from ca. 5-20 mg of dental calculus following a silica-column based extraction method 77 .An extraction blank was carried out alongside the samples for every extraction batch.dsDNA libraries for calculus samples, extraction blanks and library blanks (no-template controls) were prepared following van der Valk et al. 78 without USER treatment.Libraries were sequenced using an Illumina HiSeq 2500 platform and using 2 × 150 bp chemistry and an additional two lanes on the same platform using 2 × 100 bp chemistry (SciLifeLab, Uppsala, Sweden).
For the ancient samples B16, B20a, M72, MLM12, MLM16, MLM51, MLM80, and MLM82 from Svalbard, ca.50 mg of bone/antler powder was subjected to a DNA extraction protocol based on a previous study 79 .Briefly, bone powder was agitated for 48 h at 55 °C in 1 mL lysis buffer 80 consisting of 0.1 M urea, 0.45 M EDTA, 0.15 mg/mL proteinase K, and molecular biology H 2 O. 10 µL of 20 mg/mL proteinase K was added prior to the final 20 h of lysis.An extraction blank was carried out alongside the samples for every extraction batch.Then the protocol of Dabney and Meyer 79 was used with bleach-sterilised Zymo-Spin V column extenders.Dual-indexed dsDNA libraries were constructed on the eluted DNA, extraction blanks and a single library blank (no-template control) following the protocol of Meyer and Kircher 74 without USER treatment.These libraries were subjected to target enrichment with Rangifer tarandus mitochondrial DNA baits provided by Arbor Biosciences (Product Code 303048) using the MYbaits protocol v.3.02.In-solution hybridization was performed at 55 °C for 28 h.Thereafter, the pre-and post-capture libraries were sequenced on an Illumina MiniSeq instrument with 1 × 75 bp or 2 × 75 bp chemistry (NTNU University Museum) and an Illumina NextSeq 550 instrument with 1 × 75 bp chemistry (NTNU Genomics Core Facility).
All other historical and ancient samples were extracted from ca. 48-360 mg of bone/antler powder which was drilled after surface decontamination by wiping with bleach, using a silica-pellet based extraction protocol based on Rohland and Hofreiter 81 .Briefly, a digestion buffer consisting of 1.25% (v/v) proteinase K (20 mg/mL), 90% (v/v) EDTA (0.5 M) and 8.75% (v/v) molecular-grade water was used for digestion.Samples were pre-digested in 1 mL digestion buffer for 10 min at 37 °C on a rotor.The samples were spun down, the pre-digest was removed, and 4 mL digestion buffer was added to the samples.Samples were left for digestion on a rotor for 18 h at 37 °C.The lysis buffer was mixed 1:10 with Qiagen PB buffer modified by adding 9 mL sodium acetate (5 M) and 2 mL NaCl (5 M) to 500 mL of Qiagen PB buffer.pH was adjusted to 4.0 using concentrated (37%/12 M) HCl.50 µL in-solution silica beads were added, and samples were left on a rotor for 1 h at ambient temperature to allow binding of the DNA.Silica pellets with bound DNA were purified using Qiagen MinElute purification kit following the manufacturer's instructions and eluted in 65 µL Qiagen EB buffer.An extraction blank was carried out alongside the samples for every extraction batch.Then dual-indexed dsDNA libraries were constructed on the eluted DNA samples, extraction blanks, and a single library blank (no-template control) following the BEST v.1.1 protocol 82 .Thereafter, libraries were sequenced on an Illumina MiniSeq (NTNU University Museum), an Illumina NextSeq 550 (NTNU Genomics Core Facility), or an Illumina NovaSeq 6000 platform using 2 × 150 bp chemistry (Norwegian National Sequencing Centre, Novogene UK).
For the modern samples, DNA was extracted from ca. 20 mg of skin or muscle tissue alongside an extraction blank using a DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany) following the manufacturer's instructions.For bone and antler samples minor modifications were applied as previously described in Peeters et al. 83 .Libraries were then either prepared in-house or at Novogene UK.For in-house library preparation, DNA was subsequently sheared to a fragment length of 400 bp using a Covaris ME220 focused ultrasonicator.Next, libraries were prepared from the sheared DNA using a blunt-end, single-tube protocol for double-stranded DNA (BEST v.1.1) 82.For libraries prepared at Novogene UK, DNA was sheared to a fragment length of 350 bp and subsequently endrepaired, A-tailed and ligated with Illumina adapters.Libraries were then either sequenced on an Illumina HiSeq 4000 platform using a 2 × 150 bp set up at the NTNU Genomics Core Facility or on an Illumina NovaSeq 6000 platform using 2 × 150 bp set up at Novogene, UK.

Bioinformatic procedures
Raw sequence reads of the ancient samples were trimmed of adapters and internal barcodes, reads shorter than a length of 30 bp were discarded and paired-end reads were merged using fastp v.0.23.2 84 .Subsequently, the merged reads were mapped to a reference mitogenome of Rangifer tarandus (GenBank Accession: NC_007703.1)using the BWA aln algorithm v.0.7.17 85 .
Next, the alignments were converted from SAM to BAM format and coordinate sorted and indexed using SAMtools v.1.9 86.PCR duplicates were then removed using the samremovedup.pypython script (https:// github.com/ pontu ssk/ samre moved up).For samples that were sequenced with multiple runs, reads were merged using SAMtools merge v.1.9.
Mapped reads were then imported into Geneious Prime v.2022.2.2 (https:// www.genei ous.com) 87 and consensus sequences were called for each sample using the majority call rule with a minimum depth of 5 × per position.Any ambiguous positions remaining as well as positions with no coverage at all were called as "N".Variants and sites where two bases remained in a 50:50 ratio, were visually inspected and called as "N" in the consensus sequences to avoid incorporation of erroneous SNPs.
Endogenous DNA content (defined as the number of reads mapping to the reference mitogenome divided by the total number of reads prior to duplicate removal) for the historical and ancient samples was estimated from the BAM files before duplicate removal using SAMtools flagstat v.1.9.The sequences were then aligned using the MUSCLE v.3.8.425 88 plug-in in Geneious Prime v.2022.2.2 with a maximum number of iterations of 8. Afterwards, remaining indels in the alignment were visually detected and removed.
Modern samples were processed using the GenErode bioinformatics pipeline 89 .Briefly, raw sequences were trimmed of adapters using fastp v.0.22.0 and then mapped to the Rangifer tarandus mitogenome reference using the BWA mem algorithm v.0.7.17.Next, reads were sorted and PCR duplicates were removed using SAMtools v.1.12.The resulting BAM files were then subsampled to an average depth of 93-100 × using SAMtools view v.1.9and subsequently imported into Geneious Prime v.2022.2.2, where consensus sequences were called as described above.
In this study we generated and collated data for a total of 174 mitogenomes.Within this dataset, we constructed a haplotype network based on 166 of these mitogenomes whereas a subset of 83 mitogenomes was used for the construction of a phylogenetic tree (see details below).

Beast phylogeny and haplotype network
To quantify the haplotype diversity and examine the relationships among populations and islands, we built a haplotype network for 166 of the total 174 mitogenomes from all over the Holarctic range.The alignment was then exported in nexus format with a traits block containing a division into the nine sampling regions into PopART v.1.7 90 .Since PopART masks any columns in the alignment with ambiguous or missing sites (e.g., 'N'), several samples (n = 8; four historical Swedish, three ancient Eastern Greenland and one historical Russian) with a high proportion (i.e., x = 27.8%) of such sites were removed to avoid underestimation of haplotype number and an overall loss of resolution.The final dataset was then used to build a haplotype network using the medianjoining option in PopART 90,91 .
Secondly, we built a phylogeny for modern, historical and ancient samples using BEAST v.1.10.4 92 .To facilitate the visualisation of the phylogeny, the number of Svalbard sequences was randomly reduced to a subset of 22 mitogenomes (i.e.; 11 historical and ancient and 11 modern ones), leading to a total of 83 mitogenomes used for the phylogeny.To estimate the molecular age of the undated samples, an 'undated' taxa partition was created using the 'sampling with individual priors' option.The median probability values of the 14 C dated samples (cal BP) were used for the calibrated tip dates.Additionally, ten modern and historical samples from animals put-down either in 1994 or 1910 were also used as tip dates.The evolutionary model used for the mitogenome dataset was determined in jmodeltest2 v.2.1.9 93,94to be HKY with gamma distribution and invariant sites.A constant size model with a strict molecular clock and a clock rate with a uniform distribution and a mean value of 9.4148 × 10 -8 substitutions/site/year (95% HPD Interval: 5.8434 × 10 -8 , 1.3088 × 10 -7 ) was used for the construction of the phylogeny.The substitution rate was estimated in a preliminary run, using only the modern samples and the dated samples with their median 14 C ages as tip dates.The ages of the undated samples were estimated using a uniform prior ranging from 0 to 15,000 years BP.Three independent trees were then run for 100 million generations with sampling every 1000 generations.All BEAST runs were checked visually for convergence (i.e., aiming for a minimum effective sample size for parameters above 200) in Tracer v.1.7.2 95 .Subsequently, the three individual trees and log files were combined after removing 10% burn-in (10 million generations) of each file using LogCombiner v.1.10.4.Next, TreeAnnotator v.1.10.4 was used to summarise the trees from the combined trees file into a single target tree.Lastly, the Bayesian tree was visualised and built in FigTree v.1.4.4 96 .Summary statistics on nucleotide diversity (π), haplotype diversity (Hd) and total number of segregating sites were calculated using DnaSP6 v.6.12.03 97 .

Tests of positive selection
To test for evidence of positive selection in Svalbard reindeer, the reference and annotation for the Rangifer tarandus mitochondrial reference sequence (GenBank Accession: NC_007703.1)were downloaded and imported into Geneious.The annotated reference was then aligned to the modern Svalbard sample sequences (n = 96) using the MUSCLE v.3.8.425 plug-in in Geneious with a maximum number of iterations of 8. Next, non-coding regions and regions coding for tRNA and rRNA were removed manually.
To identify stop codons and remove the three corresponding nucleotides without causing frameshifts, the consensus nucleotide sequence was translated to an amino acid (AA) sequence using the translation tool for vertebrate mitochondrial DNA in Geneious.Due to the overlap in the annotation, protein coding regions of the mitogenomes were then separated into three different alignments.A first alignment 'Reindeer 1' containing genes ND1, ND2, COX1, COX2, ATP8, COX3, ND3, ND4L, ND5 and CYTB; a second alignment 'Reindeer 2' containing genes ATP6 and ND4 and a third alignment 'Reindeer 3' containing gene ND6.Finally, the alignment files were exported from Geneious in fasta format for further downstream analyses.
We identified synonymous and non-synonymous substitutions using the Mixed Effects Model of Evolution (MEME) method 70 and the Fast Unconstrained Bayesian AppRoximation (FUBAR) method 67 available at datamonkey.org 98,99 .For each method the alignment files were uploaded on datamonkey.org and the genetic code was set to vertebrate mitochondrial DNA code.The p-value threshold for MEME was set to 0.05 and the cut-off for the posterior probability for FUBAR was set to 0.9.The resulting FUBAR values for dS and dN were used to manually calculate ω.If ω = 1 the site is considered to evolve without selective pressure (i.e., neutrally), if ω < 1 the site is considered to be under negative selection (i.e., purifying selection) and if ω > 1 the site is under positive selection [100][101][102] .

Figure 1 .
Figure 1.Map of the Holarctic illustrating the two main hypotheses of reindeer colonisation routes to Svalbard.The red arrows indicate the western route hypothesis, and the blue arrows indicate the eastern route hypothesis.Franz Josef Land is labelled as 'FJL' and Novaya Zemlya as 'NVZ' .Additionally, the map includes close-ups of the three Eurasian Arctic archipelagos Novaya Zemlya, Franz Josef Land and Svalbard.Present-day glaciation is shown in the close-ups.Arrows point out the biggest islands or structures of each archipelago.Black and orange dots point out the approximate known sampling locations.

Figure 2 .
Figure 2. Median-joining haplotype network inferred from 166 whole mitogenome sequences of reindeer.The three main haplotype clusters (I-III) are circled in red, yellow and green, respectively.Cluster I and III = BEL and Cluster II = NAL.Each circle within the clusters represents a unique haplotype and its size reflects the number of individuals with that haplotype.Note that the branch length is not scaled to the number of substitutions and that the labels correspond to the first name of the individual with this haplotype appearing in the input file.Each small black circle indicates one nucleotide difference between haplotypes.Colour codes listed in the legend denote the geographic origin of the population.Fennoscandia combines the samples from Sweden, Finland and Norway.Geographic location abbreviation in sample names, SB: Svalbard; FJL: Franz Josef Land; R: Russia; NVZ: Novaya Zemlya; C: Canada; WG: Western Greenland; S: Sweden; N: Norway, F: Finland.Asterisks indicate ancient and historical samples.

Figure 3 .
Figure 3. Bayesian phylogenetic tree based on 83 whole mitogenome sequences from modern, historical and ancient reindeer.To facilitate the visualisation of the phylogeny, the number of Svalbard sequences was randomly reduced to a subset of 22 mitogenomes (i.e.; 11 historical and ancient and 11 modern ones).Asterisks indicate historical and ancient samples.Clades 1-7 are indicated with curly brackets.Clade 1 exclusively includes the individuals of the NAL lineage as indicated by a blue curly bracket and clades 2-7 include the individuals of the BEL lineage as indicated by a red curly bracket.Samples are coloured by geographical location as listed in the legend and labels include the subspecies.Major nodes are labelled alphabetically and their divergence times are listed as mean node age and 95% highest posterior density (HPD) given in thousands of years (ka) in the upper left corner.Posterior probability (PP) node support over 0.72 is indicated by black node circles.PP of the major nodes is > 0.9 except for nodes D and E (0.36 and 0.35 respectively). https://doi.org/10.1038/s41598-024-54296-2 105.nature.com/scientificreports/ the archipelagos using the 'ggplot2' and 'ggOceanMaps' packages (https:// mikko vihta kari.github.io/ggOceanMaps/)105.